Abstract

The complexity of the Quicksort algorithm is usually measured by the number of key comparisons used during its execution. When operating on a list of $n$ data, permuted uniformly at random, the appropriately normalized complexity $Y_n$ is known to converge almost surely to a non-degenerate random limit $Y$. This assumes a natural embedding of all $Y_n$ on one probability space, e.g., via random binary search trees. In this note a central limit theorem for the error term in the latter almost sure convergence is shown:

$$\sqrt{\frac{n}{2\log n}}\,(Y_n - Y) \xrightarrow{d} \mathcal{N} \qquad (n \to \infty),$$

where $\mathcal{N}$ denotes a standard normal random variable.

1 Introduction and result

Quicksort, invented by Hoare [6], is one of the most widely used algorithms for sorting. Given a list $\Gamma = (u_1,\dots,u_n) \in \mathbb{R}^n$, Quicksort starts by picking a key (i.e., an element), say the first one $u_1$, as "pivot" element. The other keys in $\Gamma$ are then partitioned into lists $\Gamma_<$ and $\Gamma_>$. Key $u_j$ is contained in list $\Gamma_<$ if the "key comparison" between the pivot element $u_1$ and $u_j$ yields $u_j < u_1$, otherwise $u_j$ is contained in list $\Gamma_>$, $2 \le j \le n$. Finally, the lists $\Gamma_<$ and $\Gamma_>$ are each sorted recursively unless their size is 0 or 1.

The complexity of Quicksort is most commonly measured by the total number of key comparisons used, although other cost measures have been studied as well. To capture the typical complexity of the algorithm it is usually assumed that the ranks of the elements in $\Gamma$ form a random, uniformly distributed permutation of $\{1,\dots,n\}$. Subsequently this model assumption is met by starting with the list $\Gamma = (U_1,\dots,U_n)$, where $(U_j)_{j \ge 1}$ is a sequence of independent random variables, each uniformly distributed on $[0,1]$. To be definite about the partitioning phase of the algorithm we assume that the order of elements in $\Gamma$ is preserved within the lists $\Gamma_<$ and $\Gamma_>$, e.g., list $\Gamma = (4,2,5,6,1,8,3,7)$ is partitioned into the lists $\Gamma_< = (2,1,3)$ and $\Gamma_> = (5,6,8,7)$. This property is shared by standard implementations when always using the first element as pivot element, see as general reference Mahmoud [9].
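The partitioning rule just described can be sketched in a few lines of Python. This is only an illustration under the stated model (first element as pivot, order preserved in both sublists, distinct keys); the function names `partition` and `quicksort_comparisons` are introduced here and do not come from the paper.

```python
def partition(keys):
    """Split keys[1:] around the pivot keys[0], preserving the original order."""
    pivot, rest = keys[0], keys[1:]
    smaller = [x for x in rest if x < pivot]      # the list Gamma_<
    larger = [x for x in rest if not x < pivot]   # the list Gamma_>
    return pivot, smaller, larger


def quicksort_comparisons(keys):
    """Return (sorted list, number of key comparisons) for first-pivot Quicksort."""
    if len(keys) <= 1:
        return list(keys), 0
    pivot, smaller, larger = partition(keys)
    left, k_left = quicksort_comparisons(smaller)
    right, k_right = quicksort_comparisons(larger)
    # The partitioning phase compares the pivot with each of the other keys once.
    return left + [pivot] + right, k_left + k_right + len(keys) - 1


example = [4, 2, 5, 6, 1, 8, 3, 7]
print(partition(example))              # pivot 4, sublists [2, 1, 3] and [5, 6, 8, 7]
print(quicksort_comparisons(example))  # sorted list and comparison count
```

For the example list from the text the first partitioning step uses 7 comparisons, and the recursion on the two sublists adds 2 and 6 more.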

We denote by $K_n$ the number of key comparisons used by Quicksort to sort the list $\Gamma = (U_1,\dots,U_n)$, $n \ge 1$, and set $K_0 := 0$. In the probabilistic analysis of the complexity of Quicksort often characteristics of $K_n$ are studied that only depend on the distribution $\mathcal{L}(K_n)$ of $K_n$. With respect to weak convergence such results are reviewed below.

However, in the present setting the $K_n$ are constructed on a joint probability space which in fact is a formulation via random binary search trees discussed at the beginning of section 2 and used in the subsequent analysis. Hence, for $(K_n)_{n \ge 0}$ also path properties (in particular strong limit theorems) can be studied. Régnier [13] showed that

$$Y_n := \frac{K_n - E[K_n]}{n+1}, \qquad n \ge 0, \tag{1}$$

is a martingale, which converges towards a random, non-degenerate limit $Y$ almost surely and in $L_p$:

$$\|Y_n - Y\|_p \to 0 \qquad (n \to \infty),$$

for any $p > 0$, where we denote $\|X\|_p := E[|X|^p]^{1/p}$ for a random variable $X$. (The case $p = 2$ is explicitly discussed in [13].) The mean of $K_n$ is $E[K_n] = 2(n+1)H_n - 4n$, where $H_n := \sum_{k=1}^{n} 1/k$ denotes the $n$th harmonic number. Another proof for the almost sure convergence of $(Y_n)_{n \ge 0}$ via a Doob-Martin compactification is given in Grübel [5], see also Evans, Grübel and Wakolbinger [3].
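The closed form for the mean can be checked against the standard recursion obtained by conditioning on the rank of the pivot, $\mu(n) = n - 1 + \frac{2}{n}\sum_{k=0}^{n-1}\mu(k)$ with $\mu(0) = 0$. The following sketch uses exact rational arithmetic; the function names are chosen here for illustration only.

```python
from fractions import Fraction


def mean_by_recursion(n_max):
    """mu(0..n_max) from mu(n) = n - 1 + (2/n) * sum_{k<n} mu(k), mu(0) = 0."""
    mu = [Fraction(0)]
    running_sum = Fraction(0)  # holds mu(0) + ... + mu(n-1)
    for n in range(1, n_max + 1):
        running_sum += mu[n - 1]
        mu.append(n - 1 + Fraction(2, n) * running_sum)
    return mu


def mean_closed_form(n):
    """E[K_n] = 2 (n + 1) H_n - 4 n with H_n the n-th harmonic number."""
    harmonic = sum(Fraction(1, k) for k in range(1, n + 1))
    return 2 * (n + 1) * harmonic - 4 * n


mu = mean_by_recursion(30)
print(all(mu[n] == mean_closed_form(n) for n in range(31)))
```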

Rösler [14] gave a proof based on a contraction argument for the convergence in distribution of $Y_n$ towards $Y$ and found that the limit distribution $\mathcal{L}(Y)$ satisfies

$$\mathcal{L}(Y) = \mathcal{L}\bigl(U Y' + (1-U) Y'' + C(U)\bigr), \tag{2}$$

with, for $x \in [0,1]$,

$$C(x) := 1 + 2x\log(x) + 2(1-x)\log(1-x), \tag{3}$$

where $U$, $Y'$ and $Y''$ are independent, $Y'$ and $Y''$ are distributed as $Y$ and $U$ is uniformly distributed on $[0,1]$.
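Since $Y$ is the $L_p$-limit of the centered variables $Y_n$, it has mean zero, so taking expectations in (2) forces $E[C(U)] = 0$. A small numerical sketch of the toll function (3) illustrates this; the midpoint rule and the helper names are choices made here, not part of the paper.

```python
import math


def xlogx(t):
    """t * log(t) with the convention 0 * log(0) := 0."""
    return 0.0 if t == 0 else t * math.log(t)


def C(x):
    """Toll function C(x) = 1 + 2 x log x + 2 (1 - x) log(1 - x) on [0, 1]."""
    return 1.0 + 2.0 * xlogx(x) + 2.0 * xlogx(1.0 - x)


def mean_of_C(steps=100_000):
    """Midpoint-rule approximation of the integral of C over [0, 1]."""
    h = 1.0 / steps
    return h * sum(C((i + 0.5) * h) for i in range(steps))


print(C(0.0), C(0.5), C(1.0))
print(mean_of_C())  # close to 0
```

The exact value follows from $\int_0^1 x\log x\,dx = -1/4$, which gives $1 + 2(-1/4) + 2(-1/4) = 0$.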

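The rate of the convergence $Y_n \to Y$ has been bounded, regarding the distributions $\mathcal{L}(Y_n)$ and $\mathcal{L}(Y)$, by various distance measures. The minimal $L_p$-metric $\ell_p$ is given by

$$\ell_p(V, W) := \ell_p(\mathcal{L}(V), \mathcal{L}(W)) := \inf\bigl\{\|V' - W'\|_p : \mathcal{L}(V') = \mathcal{L}(V),\ \mathcal{L}(W') = \mathcal{L}(W)\bigr\}, \tag{4}$$

for all $p > 0$ and random variables $V, W$ with $\|V\|_p, \|W\|_p < \infty$. Note that the infimum in (4) is over all joint distributions $\mathcal{L}(V', W')$ with the given marginals $\mathcal{L}(V)$ and $\mathcal{L}(W)$. Fill and Janson [4] obtained for all $p \ge 2$ the bounds

$$\ell_p(Y_n, Y) = O\Bigl(\frac{1}{\sqrt{n}}\Bigr), \qquad \ell_p(Y_n, Y) = \Omega\Bigl(\frac{\log n}{n}\Bigr),$$

as well as the explicit bound $\ell_2(Y_n, Y) < 2/\sqrt{n}$ for all $n \ge 1$.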

We denote by $F_V$ the distribution function of a random variable $V$. Then, for the Kolmogorov-Smirnov distance (uniform distance)

$$\varrho(V, W) := \varrho(\mathcal{L}(V), \mathcal{L}(W)) := \sup_{x \in \mathbb{R}} |F_V(x) - F_W(x)|$$

Fill and Janson [4] obtained for all $\varepsilon > 0$

$$\varrho(Y_n, Y) = O\bigl(n^{\varepsilon - 1/2}\bigr), \qquad \varrho(Y_n, Y) = \Omega\Bigl(\frac{1}{n}\Bigr).$$

For the Zolotarev metric $\zeta_3$ defined in section 2 below Neininger and Rüschendorf [10] obtained the order

$$\zeta_3(Y_n, Y) = \Theta\Bigl(\frac{\log n}{n}\Bigr). \tag{5}$$

The techniques of [10] are sufficiently sharp to obtain $\zeta_{2+\alpha}(Y_n, Y) = O((\log n)/n)$ for all $\alpha \in (0,1]$ as well. Using inequalities between probability metrics, based upon (5), a couple of upper bounds for related distance measures between $\mathcal{L}(Y_n)$ and $\mathcal{L}(Y)$ were obtained in section 3 of [10].

The results mentioned above bound distances between the distributions of the $Y_n$ and $Y$. However, the embedding of the $Y_n$ on one probability space allows to measure the approximation given by the martingale convergence $Y_n \to Y$ as well. Very recently, Bindjeme and Fill [1] started quantifying the almost sure convergence $Y_n \to Y$ by identifying the $L_2$-distance between $Y_n$ and $Y$ exactly and asymptotically (the exact expression in [1] involves a tail sum over $k \ge n+1$); asymptotically,

$$\|Y_n - Y\|_2 \sim \sqrt{\frac{2\log n}{n}} \qquad (n \to \infty). \tag{6}$$

In the present note the error term $Y_n - Y$ is further studied with respect to its asymptotic distribution:

Theorem 1.1. Let $(U_j)_{j \ge 1}$ be a sequence of independent random variables, each uniformly distributed on $[0,1]$. For the number $K_n$ of key comparisons needed by the Quicksort algorithm to sort the list $(U_1,\dots,U_n)$ and the almost sure limit $Y$ of $Y_n$ defined in (1) we have, as $n \to \infty$,

$$\sqrt{\frac{n}{2\log n}}\,(Y_n - Y) \xrightarrow{d} \mathcal{N}.$$

The methods used for the proof of Theorem 1.1 also imply convergence of the third absolute moments, which yields an asymptotic for the $L_3$-distance between $Y_n$ and $Y$:

Corollary 1.2. For the normalized number $Y_n$ of key comparisons needed by the Quicksort algorithm and its almost sure limit $Y$ as in Theorem 1.1 we have, as $n \to \infty$,

$$\|Y_n - Y\|_3 \sim \frac{2}{\pi^{1/6}} \sqrt{\frac{\log n}{n}}.$$

Notation. Throughout, by $\xrightarrow{d}$ convergence in distribution is denoted, by $\mathcal{N}$ a random variable with the standard normal distribution. The Bachmann-Landau symbols are used in asymptotic statements. By $\log x$ the natural logarithm of $x$ is denoted, $x > 0$. Moreover $x\log x := 0$ for $x = 0$ is used.

Acknowledgment 

I thank Henning Sulzbach and Kevin Leckey for comments on a draft of this note.

2 Proof 

The outline of the proof is as follows: First an explicit construction of $Y_n$ and $Y$ is recalled, which leads to a sample-pointwise recurrence relation for $Y_n - Y$. Then ideas from the contraction method are used for this recurrence. Compared to standard applications of the contraction method, mainly additional dependencies between the arising random variables need to be controlled. This is done by use of inequalities for the Zolotarev metric. The convergence in Theorem 1.1 then is shown within the Zolotarev metric, which implies the stated convergence in distribution.

2.1 Almost sure construction 

An explicit construction for the limit distribution $\mathcal{L}(Y)$ was given by Rösler [14] and recently linked to the martingale limit $Y$ by Bindjeme and Fill [1]. Since below, as a starting point, the same recursive equation (12) for $Y_n - Y$ is used as in [1], for the reader's convenience, some notation is adopted from there.

Consider the rooted complete infinite binary tree, where the root is labeled by the empty word $\epsilon$ and left and right children of a node labeled $\vartheta$ are labeled by the extended words $\vartheta 0$ and $\vartheta 1$ respectively. The set of labels is denoted by $\Theta := \bigcup_{k=0}^{\infty} \{0,1\}^k$. The length $|\vartheta|$ of a label of a node is identical to the depth of the node in the rooted complete infinite binary tree. Now the sequence of keys $(U_j)_{j \ge 1}$ is inserted into the rooted infinite binary tree according to the binary search tree algorithm: The first key $U_1$ is inserted in the root and occupies the root. Then we successively insert the following keys, where each key traverses the occupied nodes starting at the root. Whenever the key traversing is less than the occupying key at a node it moves on to the left child of that node, otherwise to its right child. The first empty node visited is occupied by the key. General references for search tree algorithms are Knuth [7] or Mahmoud [8]. We denote by $V_\vartheta$ the key which occupies the node labeled $\vartheta$. In particular we have $V_\epsilon = U_1$.
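The insertion procedure can be sketched as follows, with labels represented as strings over the characters 0 and 1 and the empty string standing for the root label. The dictionary representation and the name `insert_keys` are illustrative choices made here.

```python
def insert_keys(keys):
    """Insert keys one by one; return a dict mapping node label -> occupying key."""
    tree = {}
    for key in keys:
        label = ""  # start at the root (empty word)
        while label in tree:
            # Smaller keys move to the left child (append 0), others to the right.
            label += "0" if key < tree[label] else "1"
        tree[label] = key  # first empty node visited is occupied by the key
    return tree


tree = insert_keys([4, 2, 5, 6, 1, 8, 3, 7])
for label in sorted(tree, key=lambda s: (len(s), s)):
    print(repr(label), tree[label])

# Each key is compared once with every key on its path from the root, so the
# sum of the depths equals the number of Quicksort key comparisons for this list.
print(sum(len(label) for label in tree))
```

For the example list the root's left subtree holds the keys 2, 1, 3 and its right subtree the keys 5, 6, 8, 7, matching the first partitioning step of Quicksort.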

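Further, we associate with each node labeled $\vartheta$ an interval $I_\vartheta$ defined recursively: We set $I_\epsilon := [0,1]$. Assume that $I_\vartheta = [L_\vartheta, R_\vartheta]$ is already defined for some $\vartheta \in \Theta$. Then we set $I_{\vartheta 0} := [L_\vartheta, V_\vartheta]$ and $I_{\vartheta 1} := [V_\vartheta, R_\vartheta]$. Note that by construction we always have $V_\vartheta \in I_\vartheta$. The relative positions of $V_\vartheta$ within $I_\vartheta$ are crucial: We denote the interval lengths by $\varphi_\vartheta := R_\vartheta - L_\vartheta$ and

$$T_\vartheta := \frac{V_\vartheta - L_\vartheta}{R_\vartheta - L_\vartheta} = \frac{V_\vartheta - L_\vartheta}{\varphi_\vartheta}, \qquad \vartheta \in \Theta.$$

By construction, $(T_\vartheta)_{\vartheta \in \Theta}$ is a family of independent random variables, each uniformly distributed on $[0,1]$. Furthermore, with the function $C$ defined in (3) we set

$$G_\vartheta := \varphi_\vartheta\, C(T_\vartheta).$$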

In a related setting Rösler [14] showed that the series

$$\sum_{j=0}^{\infty} \sum_{\vartheta \in \Theta,\ |\vartheta| = j} G_\vartheta \tag{7}$$

is convergent in any $L_p$ and that it has the same distribution as the martingale limit $Y$.
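A truncated version of the series (7) gives a rough way to sample approximately from the limit distribution. The sketch below uses that the $T_\vartheta$ are independent and uniform and that the two children of a node with interval length $\varphi$ and relative position $T$ have lengths $\varphi T$ and $\varphi(1-T)$. The truncation depth is an arbitrary choice made here, and no claim is made about the size of the truncation error.

```python
import math
import random


def C(x):
    """Toll function C(x) = 1 + 2 x log x + 2 (1 - x) log(1 - x), with 0 log 0 := 0."""
    def xlogx(t):
        return 0.0 if t == 0 else t * math.log(t)
    return 1.0 + 2.0 * xlogx(x) + 2.0 * xlogx(1.0 - x)


def truncated_series(depth, rng):
    """Sum G_theta over all labels of length <= depth.

    Returns the partial sum and, per level, the total of the interval lengths.
    """
    total = 0.0
    level = [1.0]  # interval lengths phi_theta on the current level; root has length 1
    level_sums = []
    for _ in range(depth + 1):
        level_sums.append(sum(level))
        next_level = []
        for phi in level:
            t = rng.random()            # relative position T_theta, uniform on [0, 1)
            total += phi * C(t)         # contribution G_theta = phi_theta * C(T_theta)
            next_level.append(phi * t)          # length of the left child interval
            next_level.append(phi * (1.0 - t))  # length of the right child interval
        level = next_level
    return total, level_sums


rng = random.Random(2024)
samples = [truncated_series(10, rng)[0] for _ in range(200)]
print(sum(samples) / len(samples))  # sample mean of the truncated sums
```

On every level the interval lengths sum to 1, which is a convenient sanity check for the recursion.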

Bindjeme and Fill [1] showed that the random variable in (7) is even almost surely identical to $Y$. Moreover, they use the latter construction to give the following sample-pointwise extension of the distributional identity (2) for $Y$: Roughly, the left and right subtree of the root, i.e., the complete infinite binary trees rooted at the nodes labeled 0 and 1 get all their nodes' interval lengths renormalized by $1/U_1$ and $1/(1-U_1)$ respectively. This unwinds the original dependence between interval lengths from nodes of the left and right subtree induced from $U_1$ and allows an almost sure construction of the distributional identity (2). Formally, with the root key $V_\epsilon = U_1$ define for all $\vartheta \in \Theta$

$$\varphi_\vartheta^{(0)} := \frac{1}{U_1}\,\varphi_{0\vartheta}, \qquad \varphi_\vartheta^{(1)} := \frac{1}{1-U_1}\,\varphi_{1\vartheta},$$

and,

$$G_\vartheta^{(i)} := \varphi_\vartheta^{(i)}\, C(T_{i\vartheta}), \qquad Y^{(i)} := \sum_{j=0}^{\infty} \sum_{\vartheta \in \Theta,\ |\vartheta| = j} G_\vartheta^{(i)}, \qquad i \in \{0,1\}.$$

Then, cf. Proposition 2.1 in [1], we have

$$Y = U_1 Y^{(0)} + (1-U_1) Y^{(1)} + C(U_1), \tag{8}$$

and $U_1$, $Y^{(0)}$, $Y^{(1)}$ are independent and $Y^{(0)}$ and $Y^{(1)}$ have the same distribution as $Y$.

Now, we denote by $I_n$ the number of keys among $U_1,\dots,U_n$ that are inserted in the left subtree of the root, i.e., the subtree rooted at the node labeled 0. Note that conditional on $\{U_1 = u\}$ we have for $I_n$ a binomial $B(n-1,u)$ distribution. Furthermore, denote by $K_{n,0}$ and $K_{n,1}$ the number of key comparisons used to sort the left and right sublists $\Gamma_<$ and $\Gamma_>$ generated when first partitioning $(U_1,\dots,U_n)$. Note that the sizes of $\Gamma_<$ and $\Gamma_>$ are $I_n$ and $n-1-I_n$ respectively. Since the first partitioning phase of Quicksort requires $n-1$ key comparisons we have, for all $n \ge 1$

$$K_n = K_{n,0} + K_{n,1} + n - 1. \tag{9}$$

Recall the normalization (1) for $K_n$. Hence, with $\mu(n) := E[K_n]$ we define normalizations of $K_{n,0}$ and $K_{n,1}$ by

$$Y_{n,0} := \frac{K_{n,0} - \mu(I_n)}{I_n + 1}, \qquad Y_{n,1} := \frac{K_{n,1} - \mu(n-1-I_n)}{n - I_n}. \tag{10}$$

(To be definite about the notation, we have $\mu(I_n) = E[K_{I_n} \mid I_n]$ and in general $\mu(I_n) \ne E[K_{I_n}]$.) Note that conditional on $\{I_n = j\}$ we have that $Y_{n,0}$ and $Y_{n,1}$ are independent and have the same distribution as $Y_j$ and $Y_{n-1-j}$ respectively. From (1), (9) and (10) we obtain the (sample-pointwise) recurrence

$$Y_n = \frac{I_n + 1}{n+1}\, Y_{n,0} + \frac{n - I_n}{n+1}\, Y_{n,1} + \frac{n}{n+1}\, C_n(I_n + 1), \qquad n \ge 1, \tag{11}$$

where, for $1 \le i \le n$

$$C_n(i) := \frac{1}{n}\bigl(\mu(i-1) + \mu(n-i) - \mu(n) + n - 1\bigr).$$

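Altogether, (8) and (11) yield a recurrence for the error term under consideration in Theorem 1.1 for all $n \ge 1$, cf. equation (2.6) in [1],

$$Y_n - Y = \frac{I_n+1}{n+1}\bigl(Y_{n,0} - Y^{(0)}\bigr) + \frac{n-I_n}{n+1}\bigl(Y_{n,1} - Y^{(1)}\bigr) + \Bigl(\frac{I_n+1}{n+1} - U_1\Bigr) Y^{(0)} + \Bigl(\frac{n-I_n}{n+1} - (1-U_1)\Bigr) Y^{(1)} + \frac{n}{n+1}\, C_n(I_n+1) - C(U_1). \tag{12}$$

Note that $Y_n - Y$ is already centered and has variance, see (6),

$$\sigma^2(n) := \mathrm{Var}(Y_n - Y) = \|Y_n - Y\|_2^2 \sim \frac{2\log n}{n} \qquad (n \to \infty) \tag{13}$$

and $\sigma(n) > 0$ for all $n \ge 0$. Hence, with the scaling

$$X_n := \frac{Y_n - Y}{\sigma(n)}, \qquad n \ge 0, \tag{14}$$

we obtain, for all $n \ge 1$

$$X_n = A_0^{(n)}\, \frac{Y_{n,0} - Y^{(0)}}{\sigma(I_n)} + A_1^{(n)}\, \frac{Y_{n,1} - Y^{(1)}}{\sigma(n-1-I_n)} + b^{(n)}, \tag{15}$$

where

$$A_0^{(n)} := \frac{(I_n+1)\,\sigma(I_n)}{(n+1)\,\sigma(n)}, \qquad A_1^{(n)} := \frac{(n-I_n)\,\sigma(n-1-I_n)}{(n+1)\,\sigma(n)},$$

$$b^{(n)} := \frac{1}{\sigma(n)} \Bigl( \Bigl(\frac{I_n+1}{n+1} - U_1\Bigr) Y^{(0)} + \Bigl(\frac{n-I_n}{n+1} - (1-U_1)\Bigr) Y^{(1)} + \frac{n}{n+1}\, C_n(I_n+1) - C(U_1) \Bigr). \tag{16}$$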
The asymptotic of $\sigma(n)$ in (13) and that $I_n$ is conditionally on $\{U_1 = u\}$ binomial $B(n-1,u)$ distributed imply together with the strong law of large numbers and dominated convergence that, as $n \to \infty$,

$$\bigl\|A_0^{(n)} - \sqrt{U_1}\bigr\|_p \to 0, \qquad \bigl\|A_1^{(n)} - \sqrt{1-U_1}\bigr\|_p \to 0, \tag{17}$$

for all $p \ge 1$.



Remark. Note that convergence theorems from the contraction method, e.g., Corollary 5.2 in [11], do in general not apply to recurrence (15). The reason is that each of the random variables

$$\frac{Y_{n,0} - Y^{(0)}}{\sigma(I_n)}, \qquad \frac{Y_{n,1} - Y^{(1)}}{\sigma(n-1-I_n)}$$

is conditionally on $I_n$ still (stochastically) dependent on $b^{(n)}$, via the joint occurrence of $Y^{(0)}$ and $Y^{(1)}$ respectively. In typical theorems from the contraction method, however, conditional independence is assumed.



2.2 The Zolotarev metric 

The proof of Theorem 1.1 below is based on showing appropriate convergence within the Zolotarev metric and using that convergence in the Zolotarev metric implies weak convergence. The Zolotarev metric has been studied in the context of distributional recurrences systematically in [11]. We collect the properties that are used subsequently, which can be found in Zolotarev [15, 16] if not stated otherwise. For distributions $\mathcal{L}(V)$, $\mathcal{L}(W)$ on $\mathbb{R}$ the Zolotarev distance $\zeta_s$, $s > 0$, is defined by

$$\zeta_s(V, W) := \zeta_s(\mathcal{L}(V), \mathcal{L}(W)) := \sup_{f \in \mathcal{F}_s} |E[f(V) - f(W)]| \tag{18}$$

where $s = m + \alpha$ with $0 < \alpha \le 1$ and $m \in \mathbb{N}_0$. Here,

$$\mathcal{F}_s := \bigl\{ f \in C^m(\mathbb{R}, \mathbb{R}) : |f^{(m)}(x) - f^{(m)}(y)| \le |x - y|^{\alpha} \bigr\}, \tag{19}$$

denotes the space of $m$ times continuously differentiable functions from $\mathbb{R}$ to $\mathbb{R}$ such that the $m$-th derivative is Hölder continuous of order $\alpha$ with Hölder-constant 1. We have that $\zeta_s(V, W) < \infty$, if all moments of orders $1,\dots,m$ of $V$ and $W$ are equal and if the $s$-th absolute moments of $V$ and $W$ are finite. Since later on only the case $s = 3$ is used, for finiteness of $\zeta_3(V, W)$ it is thus sufficient that mean and variance of $V$ and $W$ coincide and both have a finite absolute moment of order 3. A pair $(V, W)$ satisfying these moment assumptions subsequently is called $\zeta_3$-compatible, a term not in use elsewhere. In particular, for fixed $\mu \in \mathbb{R}$ and $\sigma > 0$ within the space of distributions

$$\mathcal{M}_3(\mu, \sigma^2) := \bigl\{ \mathcal{L}(V) : E[V] = \mu,\ \mathrm{Var}(V) = \sigma^2,\ E[|V|^3] < \infty \bigr\}$$

all pairs are $\zeta_3$-compatible and $(\mathcal{M}_3(\mu, \sigma^2), \zeta_3)$ is a complete metric space. For the completeness (not used subsequently) see [2, Theorem 5.1]. Convergence in $\zeta_3$ implies weak convergence on $\mathbb{R}$. Furthermore, $\zeta_3$ is $(3,+)$ ideal, i.e.,

$$\zeta_3(V + Z, W + Z) \le \zeta_3(V, W), \qquad \zeta_3(cV, cW) = c^3 \zeta_3(V, W)$$

for all $Z$ being independent of $(V, W)$ and all $c > 0$. This in particular implies for independent $(V_1, V_2)$, $(W_1, W_2)$ such that both pairs are $\zeta_3$-compatible

$$\zeta_3(V_1 + W_1, V_2 + W_2) \le \zeta_3(V_1, V_2) + \zeta_3(W_1, W_2). \tag{20}$$

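The metric $\zeta_3$ can be upper bounded in terms of the minimal $L_3$-metric $\ell_3$ defined in (4): For $\zeta_3$-compatible $(V, W)$ we have, see [10, Lemma 2.1],

$$\zeta_3(V, W) \le c\,\bigl(\|V\|_3^2 + \|V\|_3 \|W\|_3 + \|W\|_3^2\bigr)\, \ell_3(V, W), \tag{21}$$

with an absolute constant $c > 0$ whose explicit value is stated in the cited lemma. Finally, a substitute for (20) when the independence assumption there is violated is later used: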
Lemma 2.1. Let $V_1, V_2, W_1, W_2$ be random variables such that $(V_1, V_2)$ is $\zeta_3$-compatible and $(V_1 + W_1, V_2 + W_2)$ is $\zeta_3$-compatible. Then we have

$$\zeta_3(V_1 + W_1, V_2 + W_2) \le \zeta_3(V_1, V_2) + \sum_{i=1}^{2} \Bigl( \|V_i\|_3^2 \|W_i\|_3 + \frac{1}{2} \|V_i\|_3 \|W_i\|_3^2 + \frac{1}{6} \|W_i\|_3^3 \Bigr). \tag{22}$$



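Proof. First note that for all $f \in \mathcal{F}_3$ and $g$ defined by $g(x) := f(x) - f'(0)x - f''(0)x^2/2$ for $x \in \mathbb{R}$ we have for all $\zeta_3$-compatible pairs $(V, W)$ that $E[f(V) - f(W)] = E[g(V) - g(W)]$. Since $g'(0) = g''(0) = 0$ we hence have

$$\zeta_3(V, W) = \sup_{f \in \mathcal{F}_3} |E[f(V) - f(W)]| = \sup_{g \in \mathcal{F}_3^{0}} |E[g(V) - g(W)]|,$$

with $\mathcal{F}_3^{0} := \{ g \in \mathcal{F}_3 : g'(0) = g''(0) = 0 \}$.

For $g \in \mathcal{F}_3^{0}$ we have the Taylor expansion $g(x + h) = g(x) + g'(x)h + g''(x)h^2/2 + R(x, h)$ for all $x, h \in \mathbb{R}$ with $|R(x, h)| \le |h|^3/6$. Hence, with $V_1, W_1, V_2, W_2$ as in the Lemma we obtain

$$\zeta_3(V_1 + W_1, V_2 + W_2) = \sup_{g \in \mathcal{F}_3^{0}} \bigl| E[g(V_1 + W_1) - g(V_2 + W_2)] \bigr|$$

$$= \sup_{g \in \mathcal{F}_3^{0}} \Bigl| E\Bigl[ g(V_1) + g'(V_1)W_1 + \frac{g''(V_1)W_1^2}{2} + R(V_1, W_1) - \Bigl( g(V_2) + g'(V_2)W_2 + \frac{g''(V_2)W_2^2}{2} + R(V_2, W_2) \Bigr) \Bigr] \Bigr|$$

$$\le \zeta_3(V_1, V_2) + B,$$

with

$$B := \sup_{g \in \mathcal{F}_3^{0}} \Bigl| E\Bigl[ g'(V_1)W_1 + \frac{g''(V_1)W_1^2}{2} + R(V_1, W_1) - \Bigl( g'(V_2)W_2 + \frac{g''(V_2)W_2^2}{2} + R(V_2, W_2) \Bigr) \Bigr] \Bigr|$$

$$\le \sup_{g \in \mathcal{F}_3^{0}} \sum_{i=1}^{2} E\Bigl[ |g'(V_i)W_i| + \frac{|g''(V_i)|\,W_i^2}{2} + |R(V_i, W_i)| \Bigr].$$

Since $g'(0) = g''(0) = 0$ and $g''$ is Lipschitz-continuous with Lipschitz-constant 1 we obtain for all $x \in \mathbb{R}$ that $|g''(x)| = |g''(x) - g''(0)| \le |x|$ and $|g'(x)| = |g'(x) - g'(0)| = |g''(\xi)x| \le x^2$ for appropriate $\xi$ between 0 and $x$. Hence we obtain

$$B \le \sum_{i=1}^{2} E\Bigl[ V_i^2 |W_i| + \frac{1}{2} |V_i|\, W_i^2 + \frac{1}{6} |W_i|^3 \Bigr].$$

Hölder's inequality implies the assertion. □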



2.3 Two more technical Lemmata 

The proof of Theorem 1.1 given below requires that $b^{(n)}$ defined in (16) tends to 0 in the $L_3$-norm. The following Lemma provides a quantitative estimate. We need well-known properties of the binomial distribution. Here, and subsequently, $B_{n,u}$ denotes a binomial $B(n, u)$ distributed random variable. Then we have Chernoff's bound, for all $t > 0$,

$$P(|B_{n,u} - un| \ge tn) \le 2\exp(-2nt^2),$$

where upper and lower tail probabilities are each bounded by $\exp(-2nt^2)$. We further use that the convergence

$$\lim_{n \to \infty} E\Bigl[ \Bigl| \frac{B_{n,u} - nu}{\sqrt{nu(1-u)}} \Bigr|^3 \Bigr] = E\bigl[ |\mathcal{N}|^3 \bigr] \tag{23}$$

is uniform in $u \in (0,1)$.
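The tail bound can be compared with exact binomial tail probabilities for small parameters. The following sketch computes the exact tails by direct summation; the parameter values are arbitrary choices made here for illustration.

```python
import math


def binomial_pmf(n, u, k):
    """P(B_{n,u} = k)."""
    return math.comb(n, k) * u**k * (1.0 - u) ** (n - k)


def lower_tail(n, u, t):
    """Exact P(B_{n,u} - u n <= -t n)."""
    return sum(binomial_pmf(n, u, k) for k in range(n + 1) if k - u * n <= -t * n)


def two_sided_tail(n, u, t):
    """Exact P(|B_{n,u} - u n| >= t n)."""
    return sum(binomial_pmf(n, u, k) for k in range(n + 1) if abs(k - u * n) >= t * n)


def one_sided_bound(n, t):
    """Bound exp(-2 n t^2) for each one-sided tail."""
    return math.exp(-2.0 * n * t * t)


for u in (0.1, 0.3, 0.5):
    for t in (0.05, 0.1, 0.2):
        exact = two_sided_tail(50, u, t)
        print(u, t, exact, 2.0 * one_sided_bound(50, t))
```

In the proof of Lemma 2.2 below the one-sided version is applied with $t = u/2$ and $n - 1$ trials, which is where the factor $\exp(-(n-1)u^2/2)$ comes from.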

Lemma 2.2. For $b^{(n)}$ defined in (16) we have, as $n \to \infty$,

$$\|b^{(n)}\|_3 = O\Bigl( \frac{1}{\sqrt{\log n}} \Bigr). \tag{24}$$



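Proof. We have

$$\|b^{(n)}\|_3 \le \frac{1}{\sigma(n)} \Bigl( \Bigl\| \Bigl( \frac{I_n+1}{n+1} - U_1 \Bigr) Y^{(0)} \Bigr\|_3 + \Bigl\| \Bigl( \frac{n-I_n}{n+1} - (1-U_1) \Bigr) Y^{(1)} \Bigr\|_3 + \Bigl\| \frac{n}{n+1}\, C_n(I_n+1) - C(U_1) \Bigr\|_3 \Bigr) =: \frac{1}{\sigma(n)} (S_1 + S_2 + S_3).$$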



Note that the summands $S_1$ and $S_2$ are identical. Moreover, we have that $(I_n, U_1)$ is independent of $Y^{(0)}$ and $Y^{(1)}$. Hence, we have

$$S_1 + S_2 = 2\, \Bigl\| \frac{I_n+1}{n+1} - U_1 \Bigr\|_3 \|Y\|_3.$$

Recall that conditionally on $\{U_1 = u\}$ we have that $I_n$ is binomial $B(n-1, u)$ distributed. The uniform convergence in (23) in particular implies that there exists a constant $M_1 > 0$ independent of $u \in (0,1)$ such that $E[|(B_{n-1,u} + 1)/(n+1) - u|^3] \le M_1 n^{-3/2}$. By dominated convergence we obtain $\|(I_n+1)/(n+1) - U_1\|_3 = O(1/\sqrt{n})$ and $S_1 + S_2 = O(1/\sqrt{n})$.

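To bound the summand $S_3$ note that for $S_3 = O(1/\sqrt{n})$ it is sufficient to show $\|C_n(I_n+1) - C(U_1)\|_3 = O(1/\sqrt{n})$. We have

$$\|C_n(I_n+1) - C(U_1)\|_3 \le \Bigl\| C_n(I_n+1) - C\Bigl( \frac{I_n}{n-1} \Bigr) \Bigr\|_3 + \Bigl\| C\Bigl( \frac{I_n}{n-1} \Bigr) - C(U_1) \Bigr\|_3.$$

Note that we have $\|C_n(I_n+1) - C(I_n/(n-1))\|_3 = O((\log n)/n)$ from Proposition 3.2 in Rösler [14]. Hence, it remains to bound $\|C(I_n/(n-1)) - C(U_1)\|_3$. Using symmetry in the terms $x\log x$ and $(1-x)\log(1-x)$ appearing in $C(x)$ and the triangle inequality, we have

$$\Bigl\| C\Bigl( \frac{I_n}{n-1} \Bigr) - C(U_1) \Bigr\|_3 \le 4\, \Bigl\| \frac{I_n}{n-1} \log\Bigl( \frac{I_n}{(n-1)U_1} \Bigr) \Bigr\|_3 + 4\, \Bigl\| \Bigl( \frac{I_n}{n-1} - U_1 \Bigr) \log U_1 \Bigr\|_3. \tag{25}$$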



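To bound the first summand in the latter display we again use that conditional on $\{U_1 = u\}$ the random variable $I_n$ has the binomial $B(n-1, u)$ distribution. Hence

$$\Bigl\| \frac{I_n}{n-1} \log\Bigl( \frac{I_n}{(n-1)U_1} \Bigr) \Bigr\|_3^3 = \int_0^1 E\Bigl[ \Bigl| \frac{B_{n-1,u}}{n-1} \log\Bigl( \frac{B_{n-1,u}}{(n-1)u} \Bigr) \Bigr|^3 \Bigr] du. \tag{26}$$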



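To bound the expectation appearing as integrand in the latter display we consider for $u \in (0,1)$ the event

$$E_u := \Bigl\{ B_{n-1,u} \ge \frac{1}{2}(n-1)u \Bigr\}.$$

Note that for the complement $E_u^c$ of $E_u$ Chernoff's bound yields $P(E_u^c) \le \exp(-(n-1)u^2/2)$. We denote $h(x) := x\log x$ for $x \in [0, \infty)$. With $\sup_{x \in [0,1/2]} |h(x)| = 1/e < 1$ we bound the contribution on $E_u^c$ by

$$E\Bigl[ \mathbf{1}_{E_u^c} \Bigl| \frac{B_{n-1,u}}{n-1} \log\Bigl( \frac{B_{n-1,u}}{(n-1)u} \Bigr) \Bigr|^3 \Bigr] \le u^3 \exp\Bigl( -\frac{(n-1)u^2}{2} \Bigr). \tag{27}$$

On $E_u$ we apply the mean value theorem to $h(1 + x) = h(1 + x) - h(1)$ and use that $|h'(y)| = |1 + \log y| \le 1 - \log u$ for $y \in [1/2, 1/u]$. With an appropriate constant $M_2 > 0$ we obtain

$$E\Bigl[ \mathbf{1}_{E_u} \Bigl| \frac{B_{n-1,u}}{n-1} \log\Bigl( \frac{B_{n-1,u}}{(n-1)u} \Bigr) \Bigr|^3 \Bigr] = u^3\, E\Bigl[ \mathbf{1}_{E_u} \Bigl| h\Bigl( \frac{B_{n-1,u}}{(n-1)u} \Bigr) - h(1) \Bigr|^3 \Bigr]$$

$$\le u^3 (1 - \log u)^3\, E\Bigl[ \Bigl| \frac{B_{n-1,u} - (n-1)u}{(n-1)u} \Bigr|^3 \Bigr]$$

$$\le u^3 (1 - \log u)^3\, \frac{1}{u^{3/2}(n-1)^{3/2}}\, E\Bigl[ \Bigl| \frac{B_{n-1,u} - (n-1)u}{\sqrt{(n-1)u(1-u)}} \Bigr|^3 \Bigr]$$

$$\le M_2\, \frac{u^{3/2}(1 - \log u)^3}{(n-1)^{3/2}}. \tag{28}$$

Hence, plugging (27) and (28) in (26) we obtain

$$\Bigl\| \frac{I_n}{n-1} \log\Bigl( \frac{I_n}{(n-1)U_1} \Bigr) \Bigr\|_3^3 \le \int_0^1 \Bigl( u^3 \exp\Bigl( -\frac{(n-1)u^2}{2} \Bigr) + M_2\, \frac{u^{3/2}(1 - \log u)^3}{(n-1)^{3/2}} \Bigr) du = O\Bigl( \frac{1}{n^2} \Bigr) + O\Bigl( \frac{1}{n^{3/2}} \Bigr) = O\Bigl( \frac{1}{n^{3/2}} \Bigr). \tag{29}$$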



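The second summand in (25) is bounded similarly to the estimate (28):

$$\Bigl\| \Bigl( \frac{I_n}{n-1} - U_1 \Bigr) \log U_1 \Bigr\|_3^3 = \int_0^1 E\Bigl[ \Bigl| \frac{B_{n-1,u}}{n-1} - u \Bigr|^3 \Bigr] |\log u|^3\, du$$

$$= \frac{1}{(n-1)^{3/2}} \int_0^1 (u(1-u))^{3/2} |\log u|^3\, E\Bigl[ \Bigl| \frac{B_{n-1,u} - (n-1)u}{\sqrt{(n-1)u(1-u)}} \Bigr|^3 \Bigr] du = O\Bigl( \frac{1}{n^{3/2}} \Bigr).$$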



Altogether, we have $S_3 = O(1/\sqrt{n})$, hence $S_1 + S_2 + S_3 = O(1/\sqrt{n})$. Since $\sigma(n) = \Omega(\sqrt{\log n}/\sqrt{n})$ the assertion follows. □

Remark. Using Okamoto's inequality [12] to bound $P(E_u^c)$ for $0 < u < 1/2$ in the previous proof leads to an improvement of the term $O(1/n^2)$ in (29) to $O(1/n^4)$.



Moreover the subsequent proof of Theorem 1.1 requires a rough estimate for the $L_3$-norm $\|Y_n - Y\|_3$. Note that the following Lemma 2.3 is improved later by Corollary 1.2.
Lemma 2.3. For the error term $Y_n - Y$ in Theorem 1.1 we have, as $n \to \infty$,

$$\|Y_n - Y\|_3 = O\Bigl( \sqrt{\frac{\log n}{n}} \Bigr). \tag{30}$$



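Proof. Since $Y_n$ is a bounded random variable and $Y$ has finite absolute moments of arbitrary order we have $\|Y_n - Y\|_3 < \infty$ for all $n \ge 0$. Note that with $X_n$ defined in (14) the assertion (30) is equivalent to $E[|X_n|^3] = O(1)$. From (15) we obtain

$$|X_n| \le \Lambda_0 + \Lambda_1 + |b^{(n)}|$$

with

$$\Lambda_0 := A_0^{(n)}\, \frac{|Y_{n,0} - Y^{(0)}|}{\sigma(I_n)}, \qquad \Lambda_1 := A_1^{(n)}\, \frac{|Y_{n,1} - Y^{(1)}|}{\sigma(n-1-I_n)}.$$

Hence, we have for all $n \ge 1$

$$E[|X_n|^3] \le E[\Lambda_0^3] + E[\Lambda_1^3] + E\bigl[ |b^{(n)}|^3 \bigr] + 3E[\Lambda_0^2 \Lambda_1] + 3E[\Lambda_0 \Lambda_1^2]$$

$$\qquad + 3E\bigl[ \Lambda_0^2 |b^{(n)}| \bigr] + 3E\bigl[ \Lambda_0 |b^{(n)}|^2 \bigr] + 3E\bigl[ \Lambda_1^2 |b^{(n)}| \bigr] + 3E\bigl[ \Lambda_1 |b^{(n)}|^2 \bigr] \tag{31}$$

$$\qquad + 6E\bigl[ \Lambda_0 \Lambda_1 |b^{(n)}| \bigr].$$

We use the notation

$$\beta_n := 1 \vee \max_{0 \le i \le n} E[|X_i|^3].$$

We start bounding the summand $E[\Lambda_0^3]$. Conditionally on $\{I_n = j\}$ we have that $A_0^{(n)}$ is deterministic and $|Y_{n,0} - Y^{(0)}|/\sigma(I_n)$ is distributed as $|X_j|$ for all $0 \le j \le n-1$. Hence we obtain

$$E[\Lambda_0^3] \le E\bigl[ (A_0^{(n)})^3 \bigr]\, \beta_{n-1} \tag{32}$$

and an analogous bound for $E[\Lambda_1^3]$. The summand $E[|b^{(n)}|^3]$ tends to zero by Lemma 2.2. For the summand $E[\Lambda_0^2 \Lambda_1]$ first note that again by conditioning on $\{I_n = j\}$ we have independence of $|Y_{n,0} - Y^{(0)}|/\sigma(I_n)$ and $|Y_{n,1} - Y^{(1)}|/\sigma(n-1-I_n)$ with distributions of $|X_j|$ and $|X_{n-1-j}|$ respectively. Since moreover $0 \le A_0^{(n)} \le 1$ and $0 \le A_1^{(n)} \le 1$ we obtain

$$E[\Lambda_0^2 \Lambda_1] \le \max_{0 \le j \le n-1} \|X_j\|_2^2\ \max_{0 \le j \le n-1} \|X_j\|_1.$$

Note that (14) implies $\sup_{n \ge 0} \|X_n\|_2 < \infty$, hence we have $E[\Lambda_0^2 \Lambda_1] = O(1)$. Analogously, $E[\Lambda_0 \Lambda_1^2] = O(1)$. The summands in line (31) are all bounded by Hölder's inequality, e.g., for the first of these summands we have, also using (32) and Lemma 2.2,

$$E\bigl[ \Lambda_0^2 |b^{(n)}| \bigr] \le \|\Lambda_0\|_3^2\, \|b^{(n)}\|_3 \le \beta_{n-1}^{2/3}\, \|b^{(n)}\|_3 \le \beta_{n-1}\, \|b^{(n)}\|_3 = o(1)\, \beta_{n-1}.$$

The other summands in line (31) yield the same contribution. Finally, we similarly have

$$E\bigl[ \Lambda_0 \Lambda_1 |b^{(n)}| \bigr] \le \|\Lambda_0\|_3 \|\Lambda_1\|_3 \|b^{(n)}\|_3 = o(1)\, \beta_{n-1}.$$

Collecting all terms we obtain

$$E[|X_n|^3] \le \Bigl( E\bigl[ (A_0^{(n)})^3 \bigr] + E\bigl[ (A_1^{(n)})^3 \bigr] + o(1) \Bigr) \beta_{n-1} + O(1). \tag{33}$$

With the asymptotic (17) this implies

$$E[|X_n|^3] \le \Bigl( E\bigl[ U_1^{3/2} \bigr] + E\bigl[ (1-U_1)^{3/2} \bigr] + o(1) \Bigr) \beta_{n-1} + O(1) = \Bigl( \frac{4}{5} + o(1) \Bigr) \beta_{n-1} + O(1).$$

This implies that $E[|X_n|^3] = O(1)$ as $n \to \infty$. □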
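The last step of this proof, passing from a recursive bound with leading factor $4/5 + o(1)$ to boundedness, can be spelled out by a short induction. The constants $q$, $c$, $n_1$ and $M$ below are auxiliary names introduced here for this sketch.

```latex
% Sketch of the induction behind "E[|X_n|^3] = O(1)"; q, c, n_1, M are auxiliary.
Since $E[U_1^{3/2}] + E[(1-U_1)^{3/2}] = \tfrac{2}{5} + \tfrac{2}{5} = \tfrac{4}{5} < 1$,
there exist $q \in (\tfrac{4}{5}, 1)$, $c > 0$ and $n_1 \ge 1$ such that
\[
  E[|X_n|^3] \le q\,\beta_{n-1} + c \qquad \text{for all } n \ge n_1 .
\]
Set $M := \max\{\beta_{n_1 - 1},\, c/(1-q)\} < \infty$. If $\beta_{n-1} \le M$ for some
$n \ge n_1$, then
\[
  E[|X_n|^3] \le qM + c \le qM + (1-q)M = M,
  \qquad\text{hence}\qquad
  \beta_n = \beta_{n-1} \vee E[|X_n|^3] \le M .
\]
By induction $\beta_n \le M$ for all $n \ge n_1 - 1$, that is, $E[|X_n|^3] = O(1)$.
```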
Remark. The argument of the proof of Lemma 2.3 can be extended by induction to show, as $n \to \infty$,

$$\|Y_n - Y\|_p = O\Bigl( \sqrt{\frac{\log n}{n}} \Bigr)$$

for any $p \ge 1$. A related induction argument for a bound of the minimal $L_p$-metric $\ell_p(Y_n, Y)$ is given in Fill and Janson [4, Section 3].

2.4 The proof 

Proof of Theorem 1.1. We first define a "hybrid" random variable to connect between $X_n$ and a standard normal random variable as follows: For $\mathcal{N}^{(0)}$ and $\mathcal{N}^{(1)}$ independent standard normal random variables also independent of all other random variables, i.e., independent of $(U_i)_{i \ge 1}$, we set

$$Q_n := A_0^{(n)} \mathcal{N}^{(0)} + A_1^{(n)} \mathcal{N}^{(1)}, \qquad n \ge 1.$$

Note that (17) with $p = 2$ implies that $\mathrm{Var}(Q_n) \to 1$ as $n \to \infty$. Moreover, we have $\mathrm{Var}(Q_n) > 0$ for all $n \ge 1$. Hence, there exists a sequence $(\kappa_n)_{n \ge 1}$ with $\kappa_n \to 0$ as $n \to \infty$ such that $\mathrm{Var}((1 + \kappa_n)Q_n) = 1$ for all $n \ge 1$. Denoting by $\mathcal{N}$ another standard normal random variable we have that each pair from the three random variables $X_n$, $(1 + \kappa_n)Q_n$ and $\mathcal{N}$ is $\zeta_3$-compatible. Thus, we can use the triangle inequality to obtain

$$\zeta_3(X_n, \mathcal{N}) \le \zeta_3(X_n, (1 + \kappa_n)Q_n) + \zeta_3((1 + \kappa_n)Q_n, \mathcal{N}). \tag{34}$$

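We now abbreviate for $n \ge 1$

$$Z_n^{(0)} := \frac{1}{\sigma(I_n)} \bigl( Y_{n,0} - Y^{(0)} \bigr), \qquad Z_n^{(1)} := \frac{1}{\sigma(n-1-I_n)} \bigl( Y_{n,1} - Y^{(1)} \bigr)$$

and

$$\Phi_n := A_0^{(n)} Z_n^{(0)} + A_1^{(n)} Z_n^{(1)},$$

so that $X_n = \Phi_n + b^{(n)}$ by (15). Then, Lemma 2.1 can be applied and yields

$$\zeta_3(X_n, (1 + \kappa_n)Q_n) \le \zeta_3(\Phi_n, Q_n) + \|\Phi_n\|_3^2 \|b^{(n)}\|_3 + \frac{1}{2} \|\Phi_n\|_3 \|b^{(n)}\|_3^2 + \frac{1}{6} \|b^{(n)}\|_3^3 + \Bigl( \kappa_n + \frac{1}{2}\kappa_n^2 + \frac{1}{6}\kappa_n^3 \Bigr) \|Q_n\|_3^3.$$

Note that by definition of $Q_n$ we have $\sup_{n \ge 1} \|Q_n\|_3 < \infty$. Moreover, Lemma 2.3 implies that $\sup_{n \ge 1} \|\Phi_n\|_3 < \infty$. Hence, with $\kappa_n \to 0$ and, by Lemma 2.2, $\|b^{(n)}\|_3 \to 0$ we obtain, as $n \to \infty$,

$$\zeta_3(X_n, (1 + \kappa_n)Q_n) \le \zeta_3(\Phi_n, Q_n) + o(1). \tag{35}$$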

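Next we show that for the second summand in (34) we also have $\zeta_3((1 + \kappa_n)Q_n, \mathcal{N}) = o(1)$: First note that (17) implies that the $L_3$-norm of $(1 + \kappa_n)Q_n$ is uniformly bounded in $n$. Hence, the bound (21) implies $\zeta_3((1 + \kappa_n)Q_n, \mathcal{N}) \le M \ell_3((1 + \kappa_n)Q_n, \mathcal{N})$ for all $n \ge 1$ and a constant $M > 0$. Using the uniform $U_1$ in (17) (that is also independent of $\mathcal{N}^{(0)}$ and $\mathcal{N}^{(1)}$) we have that $\sqrt{U_1}\, \mathcal{N}^{(0)} + \sqrt{1 - U_1}\, \mathcal{N}^{(1)}$ has also the standard normal distribution. Hence we obtain

$$\zeta_3((1 + \kappa_n)Q_n, \mathcal{N}) \le M\, \Bigl\| \bigl( (1 + \kappa_n) A_0^{(n)} - \sqrt{U_1} \bigr) \mathcal{N}^{(0)} + \bigl( (1 + \kappa_n) A_1^{(n)} - \sqrt{1 - U_1} \bigr) \mathcal{N}^{(1)} \Bigr\|_3 \to 0, \tag{36}$$

by independence and (17).

Hence, we obtain from (34), (35) and (36) that

$$\zeta_3(X_n, \mathcal{N}) \le \zeta_3\bigl( A_0^{(n)} Z_n^{(0)} + A_1^{(n)} Z_n^{(1)},\ A_0^{(n)} \mathcal{N}^{(0)} + A_1^{(n)} \mathcal{N}^{(1)} \bigr) + o(1). \tag{37}$$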



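Now, note that conditional on $\{I_n = k\}$, we have that $Z_n^{(0)}$ and $Z_n^{(1)}$ are independent with distributions of $X_k$ and $X_{n-1-k}$ respectively, $0 \le k \le n-1$. By $(X_0^{(0)}, \dots, X_{n-1}^{(0)})$, $(X_0^{(1)}, \dots, X_{n-1}^{(1)})$ independent vectors with identical distribution $(X_0, \dots, X_{n-1})$ are denoted. Thus, conditioning on $I_n$ and using that $\zeta_3$ is $(3,+)$-ideal and (20), we obtain

$$\zeta_3\bigl( A_0^{(n)} Z_n^{(0)} + A_1^{(n)} Z_n^{(1)},\ A_0^{(n)} \mathcal{N}^{(0)} + A_1^{(n)} \mathcal{N}^{(1)} \bigr)$$

$$\le \frac{1}{n} \sum_{k=0}^{n-1} \zeta_3\Bigl( \frac{(k+1)\sigma(k)}{(n+1)\sigma(n)} X_k^{(0)} + \frac{(n-k)\sigma(n-1-k)}{(n+1)\sigma(n)} X_{n-1-k}^{(1)},\ \frac{(k+1)\sigma(k)}{(n+1)\sigma(n)} \mathcal{N}^{(0)} + \frac{(n-k)\sigma(n-1-k)}{(n+1)\sigma(n)} \mathcal{N}^{(1)} \Bigr)$$

$$\le \frac{1}{n} \sum_{k=0}^{n-1} \Bigl\{ \Bigl( \frac{(k+1)\sigma(k)}{(n+1)\sigma(n)} \Bigr)^3 \zeta_3(X_k, \mathcal{N}) + \Bigl( \frac{(n-k)\sigma(n-1-k)}{(n+1)\sigma(n)} \Bigr)^3 \zeta_3(X_{n-1-k}, \mathcal{N}) \Bigr\}$$

$$= \frac{2}{n} \sum_{k=0}^{n-1} \Bigl( \frac{(k+1)\sigma(k)}{(n+1)\sigma(n)} \Bigr)^3 \zeta_3(X_k, \mathcal{N}). \tag{38}$$

With $\Delta(n) := \zeta_3(X_n, \mathcal{N})$ we obtain from (37) and (38) that

$$\Delta(n) \le 2\, E\Bigl[ \Bigl( \frac{(I_n+1)\sigma(I_n)}{(n+1)\sigma(n)} \Bigr)^3 \Delta(I_n) \Bigr] + o(1). \tag{39}$$

Now, a standard argument implies $\Delta(n) \to 0$ as follows: Note that $\sigma(n) \sim \sqrt{2\log(n)/n}$ and that $I_n$ is distributed uniformly on $\{0, \dots, n-1\}$ imply for $U$ uniformly distributed on $[0,1]$

$$E\Bigl[ \Bigl( \frac{(I_n+1)\sigma(I_n)}{(n+1)\sigma(n)} \Bigr)^3 \Bigr] \to E\bigl[ U^{3/2} \bigr] = \frac{2}{5}. \tag{40}$$

First we use (39) for a rough bound:

$$\Delta(n) \le 2\, E\Bigl[ \Bigl( \frac{(I_n+1)\sigma(I_n)}{(n+1)\sigma(n)} \Bigr)^3 \Bigr] \sup_{0 \le k \le n-1} \Delta(k) + o(1).$$

In view of (40) this implies that $(\Delta(n))_{n \ge 0}$ is bounded. We denote $\eta := \sup_{n \ge 0} \Delta(n) < \infty$ and $\lambda := \limsup_{n \to \infty} \Delta(n) \ge 0$. For any $\varepsilon > 0$ there exists an $n_0 \ge 0$ such that $\Delta(n) \le \lambda + \varepsilon$ for all $n \ge n_0$. Hence, from (39) we obtain

$$\Delta(n) \le 2\, E\Bigl[ \mathbf{1}_{\{I_n < n_0\}} \Bigl( \frac{(I_n+1)\sigma(I_n)}{(n+1)\sigma(n)} \Bigr)^3 \Bigr] \eta + 2\, E\Bigl[ \mathbf{1}_{\{I_n \ge n_0\}} \Bigl( \frac{(I_n+1)\sigma(I_n)}{(n+1)\sigma(n)} \Bigr)^3 \Bigr] (\lambda + \varepsilon) + o(1).$$

With $n \to \infty$ this implies

$$\lambda = \limsup_{n \to \infty} \Delta(n) \le \frac{4}{5} (\lambda + \varepsilon). \tag{41}$$

Since $\varepsilon > 0$ is arbitrary this implies $\lambda = 0$. Hence, we have $\zeta_3(X_n, \mathcal{N}) \to 0$ as $n \to \infty$. Since convergence in $\zeta_3$ implies weak convergence, the assertion follows. □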
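The contraction behind this argument is that the factor $2\,E[((I_n+1)\sigma(I_n)/((n+1)\sigma(n)))^3]$ stays below 1 for large $n$. The following sketch evaluates this factor numerically. It does not use the exact $\sigma(n)$: the function `sigma_proxy` is an assumption made here, built only from the asymptotic $\sigma(n)^2 \sim 2\log n / n$ with a shift that avoids $\log 0$ and $\log 1 = 0$, so the numbers are only indicative.

```python
import math


def sigma_proxy(k):
    """ASSUMPTION: stand-in for sigma(k), using sigma(n)^2 ~ 2 log(n) / n.

    The shift by 2 only keeps the expression positive for k = 0 and k = 1.
    """
    return math.sqrt(2.0 * math.log(k + 2) / (k + 2))


def contraction_factor(n):
    """2 * E[((I_n + 1) s(I_n) / ((n + 1) s(n)))^3], I_n uniform on {0, ..., n-1}."""
    denominator = (n + 1) * sigma_proxy(n)
    total = sum(((k + 1) * sigma_proxy(k) / denominator) ** 3 for k in range(n))
    return 2.0 * total / n


for n in (100, 10_000, 100_000):
    print(n, contraction_factor(n))
```

Because of the logarithmic corrections in the proxy the values approach the limit $2 \cdot \tfrac{2}{5} = \tfrac{4}{5}$ only slowly, but they remain clearly below 1.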
Proof of Corollary 1.2. Note that in the proof of Theorem 1.1 with $X_n = (Y_n - Y)/\sigma(n)$ the convergence $\zeta_3(X_n, \mathcal{N}) \to 0$ is shown. This implies $E[|X_n|^3] \to E[|\mathcal{N}|^3]$ as $n \to \infty$, since the function $x \mapsto |x|^3/6$ is an element of $\mathcal{F}_3$. Hence we obtain

$$\|Y_n - Y\|_3 = \sigma(n) \|X_n\|_3 \sim \sqrt{\frac{2\log n}{n}}\, \|\mathcal{N}\|_3 = \frac{2}{\pi^{1/6}} \sqrt{\frac{\log n}{n}},$$

the assertion. □
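For completeness, the value of $\|\mathcal{N}\|_3$ entering this asymptotic can be computed directly; the following short calculation is a worked check added here and uses only the standard normal density.

```latex
% Worked computation of the third absolute moment of a standard normal.
\[
  E\bigl[|\mathcal{N}|^3\bigr]
  = \frac{2}{\sqrt{2\pi}} \int_0^\infty x^3 e^{-x^2/2}\,dx
  = \frac{2}{\sqrt{2\pi}} \cdot 2
  = 2\sqrt{\frac{2}{\pi}},
\]
where $\int_0^\infty x^3 e^{-x^2/2}\,dx = 2$ follows from the substitution $v = x^2/2$. Hence
\[
  \|\mathcal{N}\|_3 = \Bigl(2^{3/2}\pi^{-1/2}\Bigr)^{1/3} = \sqrt{2}\,\pi^{-1/6},
  \qquad
  \sqrt{\frac{2\log n}{n}}\cdot\sqrt{2}\,\pi^{-1/6}
  = \frac{2}{\pi^{1/6}}\sqrt{\frac{\log n}{n}} .
\]
```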

Remark. The techniques for the proof of Theorem 1.1 can be adapted slightly to also obtain $\zeta_{2+\alpha}(X_n, \mathcal{N}) \to 0$ for all $\alpha \in (0, 1]$. This implies for all $2 < p \le 3$ the asymptotic